Knowledge-based Systems and Interestingness Measures: Analysis with Clinical Datasets

Authors

  • J. Jabez Christopher
  • Harichandran Khanna Nehemiah
  • Arputharaj Kannan
Abstract

Knowledge mined from clinical data can be used for medical diagnosis and prognosis. Improving the quality of the knowledge base can enhance the prediction efficiency of a knowledge-based system. Designing accurate and precise clinical decision support systems that use the mined knowledge is still a broad area of research. This work analyses the variation in classification accuracy of such knowledge-based systems across different rule lists. The purpose of this work is not to improve the prediction accuracy of a decision support system, but to analyze the factors that influence the efficiency and design of the knowledge base in a rule-based decision support system. Three benchmark medical datasets are used. Rules are extracted using a supervised machine learning algorithm (PART). Each rule in the ruleset is validated using nine frequently used rule interestingness measures. After the measure values are calculated, the rule lists are used for performance evaluation. Experimental results show that classification accuracy varies across rule lists. The Confidence and Laplace measures yield relatively superior accuracy: 81.188% for the heart disease dataset and 78.255% for the diabetes dataset. The accuracy of the knowledge-based prediction system depends predominantly on the organization of the ruleset. Rule length needs to be considered when deciding the rule ordering. A subset of a rule, or a combination of rule elements, may form a new rule and may itself be a member of the rule list. Redundant rules should be eliminated. Prior knowledge about the domain will enable knowledge engineers to design a better knowledge base.
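The idea of scoring rules with an interestingness measure and then ordering the rule list can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the helper names, the toy records, and the thresholds are all hypothetical, and the Laplace measure is written in its common form (correct + 1) / (covered + number of classes), which may differ from the exact definition used in the paper. The rule list is applied as a decision list, where the first matching rule fires.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Record], bool]  # antecedent of the rule
    label: str                           # class predicted by the rule


def confidence(n_covered: int, n_correct: int) -> float:
    """Fraction of covered records that the rule classifies correctly."""
    return n_correct / n_covered if n_covered else 0.0


def laplace(n_covered: int, n_correct: int, n_classes: int = 2) -> float:
    """Laplace-corrected accuracy (assumed form; k = number of classes)."""
    return (n_correct + 1) / (n_covered + n_classes)


def score_rules(rules: List[Rule], records: List[Record],
                labels: List[str], measure) -> Dict[str, float]:
    """Evaluate one interestingness measure for every rule."""
    scores = {}
    for rule in rules:
        covered = [y for x, y in zip(records, labels) if rule.condition(x)]
        n_correct = sum(1 for y in covered if y == rule.label)
        scores[rule.name] = measure(len(covered), n_correct)
    return scores


def order_rules(rules: List[Rule], scores: Dict[str, float]) -> List[Rule]:
    """Sort rules by descending score; ties keep their original order."""
    return sorted(rules, key=lambda r: scores[r.name], reverse=True)


def classify(ordered_rules: List[Rule], record: Record, default: str) -> str:
    """Decision-list prediction: the first rule that matches decides."""
    for rule in ordered_rules:
        if rule.condition(record):
            return rule.label
    return default


# Hypothetical toy data, not taken from any of the benchmark datasets.
records = [
    {"glucose": 150, "age": 50},
    {"glucose": 160, "age": 30},
    {"glucose": 100, "age": 60},
    {"glucose": 90, "age": 25},
    {"glucose": 145, "age": 55},
]
labels = ["pos", "pos", "neg", "neg", "neg"]

rules = [
    Rule("high_glucose", lambda r: r["glucose"] > 140, "pos"),
    Rule("low_glucose", lambda r: r["glucose"] <= 140, "neg"),
    Rule("older", lambda r: r["age"] > 45, "neg"),
]

laplace_scores = score_rules(rules, records, labels, laplace)
ranked = order_rules(rules, laplace_scores)
prediction = classify(ranked, {"glucose": 150, "age": 50}, default="neg")
```

In this toy setting the record with glucose 150 and age 50 matches both `high_glucose` and `older`, which predict different classes, so the position of each rule in the list determines the prediction. This is the kind of ordering effect the abstract attributes to the organization of the ruleset.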


Similar resources

Selecting a Right Interestingness Measure for Rare Association Rules

In the literature, the properties of several interestingness measures have been analyzed and a framework has been proposed for selecting a right interestingness measure for extracting association rules. As rare association rules contain useful knowledge, researchers are making efforts to investigate efficient approaches to extract the same. In this paper, we make an effort to analyze the proper...


Interestingness Measures for Association Rules in a KDD Process : PostProcessing of Rules with ARQAT Tool

This work takes place in the framework of Knowledge Discovery in Databases (KDD), often called "Data Mining". This domain is both a main research topic and an application field in companies. KDD aims at discovering previously unknown and useful knowledge in large databases. In the last decade much research has been published about association rules, which are frequently used in data mining. ...


A Graph-based Clustering Approach to Evaluate Interestingness Measures: A Tool and a Comparative Study

Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to improve the choice of the most suitable measures for a given application. As interestingness depends both on the data structure and ...


Interestingness Measures for Rare Association Rules and Periodic-Frequent Patterns

Data mining is the process of discovering significant and potentially useful knowledge in the form of patterns from the data. As a result, the notion of interestingness is very important for extracting useful knowledge patterns. Numerous interestingness measures have been discussed in the literature to assess the interestingness of a knowledge pattern. In this thesis, we focus on selecting a ri...


Ranking the Interestingness of Summaries from Data Mining Systems

We study data mining where the task is description by summarization, the representation language is generalized relations, the evaluation criteria are based on heuristic measures of interestingness, and the method for searching is the Multi-Attribute Generalization algorithm for domain generalization graphs. We present and empirically compare four heuristics for ranking the interestingness of ...


Journal title:
  • CIT

Volume 24

Publication date: 2016